A Study of Interestingness Measures for Associative Classification on Imbalanced Data
Abstract
Associative Classification (AC) is a well-known tool in knowledge discovery and has been shown to produce competitive classifiers. However, imbalanced data poses a challenge for most classifier learning algorithms, including AC methods. Because Interestingness Measures (IMs) play an important role in the AC process, both in generating interesting rules and in building good classifiers, selecting suitable IMs is essential for improving AC's performance on imbalanced data. In this paper, we aim to improve AC's performance on imbalanced data by studying IMs. This involves two main tasks: first, finding which measures behave similarly on imbalanced data; second, selecting appropriate measures. We evaluate each measure's performance by AUC, which is commonly used to evaluate classification on imbalanced data. First, based on these performances, we propose a frequent correlated patterns mining method to extract stable clusters of IMs with similar behavior. Second, using an IM ranking computation method, we identify 26 suitable measures for imbalanced data and divide them into two groups: one especially suited to extremely imbalanced data and the other to slightly imbalanced data.
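To make the two ingredients of the abstract concrete, the following is a minimal Python sketch of how an interestingness measure scores a class association rule and how AUC can be computed from scores. It is an illustration only: the abstract does not name the measures studied, so support, confidence, and lift are used here as standard textbook examples (an assumption, not necessarily among the 26 selected measures), and the function names `rule_measures` and `auc` are hypothetical.

```python
from typing import Dict, Sequence


def rule_measures(n_ac: int, n_a: int, n_c: int, n: int) -> Dict[str, float]:
    """Score a rule A -> c from raw counts (illustrative measures only).

    n_ac: records matching antecedent A and class c
    n_a:  records matching antecedent A
    n_c:  records of class c
    n:    total number of records
    """
    support = n_ac / n               # P(A, c)
    confidence = n_ac / n_a          # P(c | A)
    lift = confidence / (n_c / n)    # P(c | A) / P(c)
    return {"support": support, "confidence": confidence, "lift": lift}


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (positive, here the
    minority class) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # A rule covering 10 records, 8 of them in a minority class of 20 out of 1000.
    print(rule_measures(n_ac=8, n_a=10, n_c=20, n=1000))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In this toy example the rule's support is very low because the class is rare, while its lift is high. Disagreements of this kind between measures are one reason the choice of measure can matter on imbalanced data.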
Similar resources
On Mining Fuzzy Classification Rules for Imbalanced Data
Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, such that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
Role of Interestingness Measures in CAR Rule Ordering for Associative Classifier: An Empirical Approach
Associative Classifier is a novel technique which is the integration of Association Rule Mining and Classification. The difficult task in building Associative Classifier model is the selection of relevant rules from a large number of class association rules (CARs). A very popular method of ordering rules for selection is based on confidence, support and antecedent size (CSA). Other methods are ...
Increasing the Interpretability of Rules Induced from Imbalanced Data by Using Bayesian Confirmation Measures
Approaches to support an interpretation of rules induced from imbalanced data are discussed. In this paper, the rule learning algorithm BRACID dedicated to class imbalance is considered. As it may induce too many rules, which hinders their interpretation, their filtering should be applied. We introduce three different post-pruning strategies, which aim at selecting rules having good descriptive...
Generic Associative Classification Rules: A Comparative Study
Associative classification is a supervised classification approach, integrating association mining and classification. Several studies in data mining have shown that associative classification achieves higher classification accuracy than do traditional classification techniques. However, the associative classification suffers from a major drawback: The huge number of the generated classificatio...
Publication date: 2015